The Structure of Factor Oracles

Authors

  • Maxime Crochemore
  • Lucian Ilie
  • Emine Seid-Hilmi
Abstract

The factor oracle is a relatively new data structure for the set of factors of a string. It was introduced by Allauzen, Crochemore, and Raffinot in 1999. It may recognize nonfactors (hence the name “oracle”), but its implementational simplicity and experimental behaviour are stunning; factor-oracle-based string matching has been conjectured to be optimal on average. However, its structure is not well understood. We take important steps in clarifying its structure by explaining how it can be obtained as a quotient of the trie of the set of factors. When seen this way, all known properties of the factor oracle become simple observations. We also introduce a framework in which various oracles can be compared. The factor oracle is better than several natural ones obtained from the trie of the set of factors, the suffix automaton, and the factor automaton, respectively.

Similar resources

Error analysis of factor oracles

Factor oracles [1] constructed from a given text are deterministic acyclic automata accepting all substrings of the text. Factor oracles are more space-economical and easier to implement than similar data structures such as suffix trees [6]. There is, however, a drawback: a factor oracle may accept strings not in the text, which we call an error acceptance. In this paper, we characterize factor or...
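To make the notion of an error acceptance concrete, here is a minimal brute-force sketch. It is not the characterization given in the paper above. The transition table `ORACLE` is an assumption of this example: it was written out by hand for the text `abbbaab` by following the standard online construction, and the helper names `accepted_words` and `error_acceptances` are hypothetical.

```python
# Factor oracle of "abbbaab", written out by hand (an assumption of this sketch).
# ORACLE[i] maps a letter to the target state; states are 0..7 and all are final.
ORACLE = [
    {"a": 1, "b": 2},
    {"a": 6, "b": 2},
    {"a": 5, "b": 3},
    {"a": 5, "b": 4},
    {"a": 5},
    {"a": 6},
    {"b": 7},
    {},
]


def accepted_words(transitions, state=0, prefix=""):
    """Yield every non-empty word readable from `state`.

    The automaton is acyclic, so the recursion terminates.
    """
    for letter, target in sorted(transitions[state].items()):
        word = prefix + letter
        yield word
        yield from accepted_words(transitions, target, word)


def error_acceptances(text, transitions):
    """Return the accepted words that are not substrings of `text`."""
    n = len(text)
    factors = {text[i:j] for i in range(n) for j in range(i + 1, n + 1)}
    return sorted(w for w in accepted_words(transitions) if w not in factors)


print(error_acceptances("abbbaab", ORACLE))
```

For this text the sketch reports six error acceptances, among them `abba`, which can be read along states 0, 1, 2, 3, 5 although it never occurs in `abbbaab`. Enumerating all accepted words is exponential in general, so this only suits small illustrations.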

Constructing Factor Oracles

A factor oracle is a data structure for weak factor recognition. It is an automaton built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, and has m to 2m−1 transitions. In this paper, we give two alternative algorithms for its construction and prove the constructed automata to be equivalent to the automata constructed by the a...
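The properties listed above can be checked on a small example. The sketch below uses the well-known online construction from the 1999 paper by Allauzen, Crochemore, and Raffinot, not the two alternative algorithms this abstract refers to. The names `build_factor_oracle`, `accepts`, and `supply` are choices made for this illustration.

```python
def build_factor_oracle(p):
    """Return (transitions, supply) for the factor oracle of the string p.

    transitions[i] maps a letter to a target state; states are 0..len(p).
    supply[i] is the supply (failure) link, used only during construction.
    """
    m = len(p)
    transitions = [{} for _ in range(m + 1)]
    supply = [-1] * (m + 1)
    for i, letter in enumerate(p):
        new_state = i + 1
        transitions[i][letter] = new_state  # internal transition i -> i+1
        k = supply[i]
        # Walk the supply links, adding an external transition wherever
        # the current letter is not yet readable.
        while k != -1 and letter not in transitions[k]:
            transitions[k][letter] = new_state
            k = supply[k]
        supply[new_state] = 0 if k == -1 else transitions[k][letter]
    return transitions, supply


def accepts(transitions, word):
    """All states are final, so a word is accepted iff it can be read from state 0."""
    state = 0
    for letter in word:
        if letter not in transitions[state]:
            return False
        state = transitions[state][letter]
    return True


p = "abbbaab"
m = len(p)
transitions, _ = build_factor_oracle(p)
n_transitions = sum(len(t) for t in transitions)
factors = {p[i:j] for i in range(m) for j in range(i + 1, m + 1)}

print(len(transitions) == m + 1)                       # m+1 states
print(m <= n_transitions <= 2 * m - 1)                 # m to 2m-1 transitions
print(all(t > i for i, row in enumerate(transitions)   # acyclic: every transition
          for t in row.values()))                      # goes to a later state
print(all(accepts(transitions, f) for f in factors))   # every factor is accepted
```

Each check prints `True` for this input. Recognition is "weak" because the converse fails: `accepts(transitions, "abba")` also returns `True` even though `abba` is not a factor of `abbbaab`.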

Finding Maximal Repeats with Factor Oracles

Factor oracles, built from an input text, are automata similar to suffix automata that accept at least all substrings of the input text. In papers [LL00] and [LLA02], factor oracles are used to detect repeats in a text. Although the repeats found with these methods are not maximal, the average error is very low and the algorithm runs quite fast. In this paper, we present two ideas to improve accuracy of t...

Weak Factor Automata: Comparing (Failure) Oracles and Storacles

The factor oracle [3] is a data structure for weak factor recognition. It is a deterministic finite automaton (DFA) built on a string p of length m that is acyclic, recognizes at least all factors of p, has m+1 states which are all final, is homogeneous, and has m to 2m − 1 transitions. The factor storacle [6] is an alternative automaton that satisfies the same properties, except that its numbe...

Using Factor Oracles for Machine Improvisation

We describe variable Markov models that we have used for statistical learning of musical sequences, then present the factor oracle, a data structure proposed by Crochemore et al. for string matching. We show the relation between this structure and the previous models and indicate how it can be adapted for learning musical sequences and generating improvisations in a real-time context.

Journal:
  • Int. J. Found. Comput. Sci.

Volume 18

Publication date: 2007